Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition

Authors

Abstract

Gesture recognition has attracted considerable attention owing to its great potential in applications. Although great progress has been made recently in multi-modal learning methods, existing methods still lack effective integration to fully explore synergies among spatio-temporal modalities for gesture recognition. The problems are partially due to the fact that manually designed network architectures have low efficiency in the joint learning of multi-modalities. In this paper, we propose the first neural architecture search (NAS)-based method for RGB-D gesture recognition. The proposed method includes two key components: 1) enhanced temporal representation via the 3D Central Difference Convolution (3D-CDC) family, which is able to capture rich temporal context by aggregating temporal difference information; and 2) optimized backbones for multi-sampling-rate branches and lateral connections among varied modalities. The resultant multi-modal multi-rate network provides a new perspective to understand the relationship between RGB and depth modalities and their temporal dynamics. Comprehensive experiments are performed on three benchmark datasets (IsoGD, NvGesture, and EgoGesture), demonstrating state-of-the-art performance in both single- and multi-modality settings. The code is available at https://github.com/ZitongYu/3DCDC-NAS.
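The abstract does not spell out how a central difference convolution aggregates difference information. As a rough illustration only, the sketch below shows a simplified 1D temporal analogue in plain Python, assuming the commonly used formulation in which a vanilla convolution is blended with a term built from differences to the center sample. The function name `central_difference_conv1d` and the blending parameter `theta` are hypothetical and are not taken from this page or confirmed against the authors' repository. The actual 3D-CDC family operates on spatio-temporal volumes.

```python
def central_difference_conv1d(signal, kernel, theta=0.5):
    """Simplified 1D sketch of a central difference convolution.

    Each output sample is the vanilla (valid-mode) cross-correlation minus
    theta * center_sample * sum(kernel). With theta = 1 this equals
    sum_n kernel[n] * (x[t + n] - x[t]), i.e. a pure difference response;
    with theta = 0 it reduces to the vanilla cross-correlation.
    `theta` is an assumed blending weight, not a value from the paper.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd so a center sample exists")
    half = len(kernel) // 2
    kernel_sum = sum(kernel)
    out = []
    for t in range(half, len(signal) - half):
        window = signal[t - half : t + half + 1]
        vanilla = sum(w * x for w, x in zip(kernel, window))
        # Subtract the center-weighted term to emphasise temporal change.
        out.append(vanilla - theta * signal[t] * kernel_sum)
    return out


if __name__ == "__main__":
    # A constant sequence carries no temporal change, so theta = 1 yields zeros.
    print(central_difference_conv1d([3, 3, 3, 3], [1, 2, 1], theta=1))
    # A changing sequence produces a non-zero difference response.
    print(central_difference_conv1d([1, 2, 4, 7], [1, 2, 1], theta=1))
```

In this form, a static input contributes nothing when `theta` is 1, which is one way to see why a difference-based operator emphasizes motion over appearance.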


Similar resources

Challenges in Multi-modal Gesture Recognition

This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. We began right at the start of the Kinect™ revolution when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras,...

Bayesian Co-Boosting for Multi-modal Gesture Recognition

With the development of data acquisition equipment, more and more modalities become available for gesture recognition. However, there still exist two critical issues for multimodal gesture recognition: how to select discriminative features for recognition and how to fuse features from different modalities. In this paper, we propose a novel Bayesian Co-Boosting framework for multi-modal gesture ...

Multi-modal Integration for Gesture and Speech

Demonstratives, in particular gestures that “only” accompany speech, are not a big issue in current theories of grammar. If we deal with gestures, fixing their function is one big problem, the other one is how to integrate the representations originating from different channels and, ultimately, how to determine their composite meanings. The growing interest in multi-modal settings, computer sim...

Tight Frame Approximation for Multi-Frames and Super-Frames

In this thesis, a generator for a multi-frame or super-frame generated under the action of a projective unitary representation for discrete countable groups is investigated. Examples of such frames include Gabor multi-frames, Gabor super-frames, and frames for shift-invariant subspaces. We show that there exists a unique normalized tight multi-frame (super-frame) generator that has minimum distance from it. Similar problems are also posed for dual frames, and some ...

Hybridization of Facial Features and Use of Multi Modal Information for 3D Face Recognition

Despite achieving good performance in controlled environments, conventional 3D face recognition systems still encounter problems in handling large variations in lighting conditions, facial expression, and head pose. Humans use a hybrid approach to recognize faces, and therefore in the proposed method the human face recognition ability is incorporated by combining global and local ...


Journal

Journal title: IEEE Transactions on Image Processing

Year: 2021

ISSN: 1057-7149, 1941-0042

DOI: https://doi.org/10.1109/tip.2021.3087348